Abstract


This paper presents an approach to using object-oriented programming for the generation of an
object recognition program that recognizes a complex 3-D object within a jumbled pile.


We generate a recognition program from an interpretation tree that classifies an object into an
appropriate attitude group, whose members have a similar appearance. Each node of an interpretation
tree represents a feature matching operation. We convert each feature extracting or matching
operation into an individual processing entity, called an object. Two kinds of objects have been
prepared: data objects and event objects. A data object is used for representing geometric objects
(such as edges and regions) and extracting features from geometric objects. An event object is used
for feature matching and attitude determination. A library of prototypical objects is prepared, and
an executable program is constructed by properly selecting and instantiating modules from it. The
object-oriented programming paradigm provides modularity and extensibility.


This  method  has  been  applied  to  the  generation  of  a  recognition  program  for  a  toy  wagon.  The 
generated  program  has  been  tested  with  real  scenes  and  has  recognized  the  wagon  in  a  pile. 
1 Introduction

Traditionally, a recognition program is generated by a human expert who examines the
features of an object, develops a strategy for a recognition procedure, and writes a specialized
program for the individual object. However, this "hand writing" of a recognition program
requires a long time for programming and testing. In order to reduce the development time,
several researchers have investigated methods to automatically generate recognition programs
from object models [5,7,8].

Automatic generation of a recognition program requires several key components:

•  object  models  to  describe  the  geometric  and  photometric  properties  of  an  object  to 
be  recognized; 

•  sensor  models  to  predict  object  appearances  from  the  object  model  under  a  given 
sensor; 

•  strategy  generation  using  the  predicted  appearances  to  produce  a  recognition 
strategy; 

• program generation converting the recognition strategy to an executable program.

This  paper  concentrates  on  the  final  stage,  i.e.  program  generation.  We  will  investigate  a  way  to 
automatically  generate  a  program  to  localize  an  object  under  the  assumption  that  its  recognition 
strategy  is  given. 

We propose to prepare a library of modules to be used for converting a strategy into a program
and to construct the program by properly selecting modules from the library. Our method is
based on object-oriented programming. An object in object-oriented programming is a
processing unit, which can store several internal values in slots and execute various operations.
This paper identifies the necessary operations in recognition strategies and prepares the
prototypes of the objects to execute the strategies in the library. Then, this paper defines a
generation method for an executable program by instantiating the objects in the library. Finally,
this paper applies the method to a toy wagon to generate a recognition program and executes the
generated program in a real scene to demonstrate the validity of our method.


2  Generating  a  Recognition  Strategy 

This section overviews our recognition strategy, which is to be converted into an executable
program in the following sections. Our paradigm is to generate a recognition program to localize
a 3D object within a jumbled pile under the assumption that its geometric and photometric
properties, sensor characteristics, and sensing conditions are known. The basic recognition
strategy is to classify one unknown attitude (one object appearance) into one of several possible
attitude groups by using various available features, and then to determine the precise attitude by
solving equations based on the visible features of the group. Each group consists of topologically
equivalent object appearances and is referred to as an aspect [9].

Strategy generation is performed by recursive subdivision of possible aspects by available
features. Strategy generation starts with a root node which contains all possible aspects. From
then on, whenever a new classification is done, new nodes are generated. At each node of the
interpretation tree, each available feature is examined to determine whether it can classify the
group of aspects in the node into smaller groups of aspects. If it can, the feature is stored at
the node, and subnodes corresponding to the classified subgroups of aspects are generated and
connected to the node. Thus, the generated recognition strategy is represented as a tree, which
we call an interpretation tree. Intermediate nodes of the interpretation tree correspond to
classification stages and leaf nodes correspond to classification into individual aspects [7].

Two kinds of features are used for matching: unitary features and relational features. A
unitary feature can be represented as scalar numbers, such as the area and moment of a visible face,
while a relational feature is a detailed relational description between visible faces, such as
face-face relations and face-edge relations.

At the completion of the aspect classification, each intermediate node of the interpretation tree
records the feature to be used for classification, and each leaf node contains one single aspect.
Suppose that, at this moment, we apply the interpretation tree to one object appearance.^1 Then, we
can classify the appearance into the corresponding aspect at the leaf node by using the same
features and values recorded at each intermediate node of the interpretation tree.
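The recursive subdivision and the later classification of one appearance can be sketched as follows. This is only an illustration in Python, not the strategy generator of the paper: the feature table, the midpoint rule for choosing a threshold, and the names `build_tree`, `find_split`, and `classify` are all assumptions, and only unitary (scalar) features are handled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

FeatureTable = Dict[str, Dict[str, float]]  # aspect name -> feature name -> value


@dataclass
class Node:
    aspects: List[str]
    feature: Optional[str] = None       # feature stored at an intermediate node
    threshold: Optional[float] = None   # value recorded with that feature
    children: List["Node"] = field(default_factory=list)  # [high side, low side]


def find_split(names: List[str], table: FeatureTable,
               features: List[str]) -> Optional[Tuple[str, float]]:
    """Return the first feature (and a threshold) that separates the group."""
    for feature in features:
        values = sorted({table[name][feature] for name in names})
        if len(values) > 1:
            mid = len(values) // 2
            # Midpoint between two adjacent distinct values: both sides are non-empty.
            return feature, (values[mid - 1] + values[mid]) / 2.0
    return None


def build_tree(names: List[str], table: FeatureTable, features: List[str]) -> Node:
    node = Node(aspects=sorted(names))
    split = find_split(names, table, features) if len(names) > 1 else None
    if split is None:
        return node  # leaf: one aspect, or no available feature separates the group
    node.feature, node.threshold = split
    high = [n for n in names if table[n][node.feature] >= node.threshold]
    low = [n for n in names if table[n][node.feature] < node.threshold]
    node.children = [build_tree(high, table, features),
                     build_tree(low, table, features)]
    return node


def classify(node: Node, appearance: Dict[str, float]) -> List[str]:
    """Follow the recorded features and thresholds down to a leaf."""
    while node.children:
        go_high = appearance[node.feature] >= node.threshold
        node = node.children[0] if go_high else node.children[1]
    return node.aspects


# Hypothetical aspects with made-up feature values.
table = {
    "a1": {"area": 120.0, "moment": 6000.0},
    "a2": {"area": 120.0, "moment": 3000.0},
    "a3": {"area": 60.0, "moment": 3000.0},
}
tree = build_tree(list(table), table, ["area", "moment"])
print(tree.feature, tree.threshold)
print(classify(tree, {"area": 118.0, "moment": 3100.0}))
```

In this toy table the root splits on area and the subgroup {a1, a2} is then split on moment, so every leaf holds a single aspect, mirroring the structure described above.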

The next task will be to determine the exact attitude of the object within that aspect. Once an
appearance is classified into an aspect, the interpretation tree knows the correspondence between
image regions and object faces, in particular the correspondence between the entry region and
the corresponding object face. Thus, once we define the local coordinates of the object face by
the surface orientation of the face, the minimum moment direction of the face, and the
relationship between visible faces, we can recover the local coordinates relative to the world
because those three pieces of information can be obtained from the entry region. Then, the object
attitude can be recovered from the local coordinates and the transformation from the local
coordinates to the body coordinates of the object.

^1 More precisely, one image region of an object appearance is given to the interpretation tree. We will denote the
image region from which the process begins as the entry region.

After the exact attitude of the object is obtained, the system generates an expected image by
using a geometric modeler. Edges in the expected image will be compared with the edges in the
input image to confirm the recognition. The voting index method provides a way to match the
expected edges with the extracted edges and gives the reliability of the recognition. For the
voting index, see Appendix II.

3  Object  Library 

This  section  will  consider  how  to  convert  a  given  strategy  into  an  executable  program.  A 
recognition  strategy  is  given  as  an  interpretation  tree  in  our  system;  each  node  of  an 
interpretation  tree  contains  a  group  of  aspects  and  one  of  the  feature  matching  operations  to  be 
used.  We  will  identify  necessary  matching  operations,  and  design  objects  to  perform  the 
operations  by  using  the  object-oriented  programming  technique. 

An object in object-oriented programming is a processing unit, which can store several internal
values in slots. We can define demon functions for each slot, where a demon function will be
invoked implicitly when we retrieve a value from the slot or insert a value into the slot. An
object can execute an operation explicitly when we send a particular message to the object. An
object can be defined as an instance of a prototypical object. An instance object can inherit slot
names, slot values, demon functions, and operations of the prototypical object.^2

Two  kinds  of  objects  are  prepared  in  our  object  library.  One  is  a  data  object,  which  is  used  in 
representing  geometric  objects  (such  as  edge  and  region)  and  extracting  features  from  geometric 
objects.  The  other  is  an  event  object,  which  is  used  to  control  the  matching  and  determine  the 
exact  attitude  after  the  interpretation. 


^2 There are several implementations of objects. In our system, we use a modified Framekit+, originally developed
at Carnegie Mellon University [2].


3.1  Data  Object 

Our  system  uses  photometric  stereo  to  obtain  region  information  [6],  and  uses  a  line  extractor 
to  obtain  edge  information  [1,  10].  To  represent  these  pieces  of  information,  we  create  two 
prototypical data objects in the object library. They are:

•  Region 

•  Edge 

The following example shows the definitions of the two abstract objects, an abstract-region
and an abstract-edge.

(abstract-region-object
  (is-a program-object)
  (id-number)
  (area)
  (maximum-x)
  (minimum-x)
  (maximum-y)
  (minimum-y)
  (mass-center)
  (moment)
  (orientation)
  (region-search-distance)
  (region-image-model-distance-coef)
  (region-image-model-area-coef)
  (region-image-model-moment-coef)
  (region-area
    (if-needed-demon region-area-func))
  (region-moment
    (if-needed-demon region-moment-func))
  (region-moment-ratio
    (if-needed-demon region-moment-ratio-func))
  (region-orientation
    (if-needed-demon region-orientation-func))
  (region-region-relation
    (if-needed-demon region-region-relation-func)))

(abstract-edge-object
  (is-a program-object)
  (id-number)
  (start-point)
  (end-point)
  (center)
  (length)
  (direction)
  (edge-region-relation
    (if-needed-demon edge-region-relation-func)))
In the definition of an abstract-region, the is-a slot indicates that this abstract object is a
program object. Slots from id-number through orientation will store the image properties of
individual regions by the inheritance mechanism. Slots from region-search-distance through
region-image-model-moment-coef keep global knowledge, such as the search distance for relational
features or the coefficients between data in the geometric modeler and image data. Slots from
region-area through region-region-relation store features which are obtained from image
properties by the demon functions attached to the slots.

We can make instance objects of these abstract objects. When the instance objects are
generated, the image properties of each region or edge are extracted from an image and stored in
the corresponding slots. Thus, for example, an instance object of a region looks like:

(RIO
  (instance abstract-region)
  (id-number 100)
  (maximum-x 100)
  (maximum-y 100)
  (minimum-x 50)
  (minimum-y 50)
  (mass-center (75 75))
  (moment (8000 200 0.2))
  (orientation (0.0 0.0 1.0)))


The global knowledge and demon functions can be accessed from an instance object through
the inheritance mechanism if necessary. For example, if the feature region-area of the instance
object RIO is accessed by a recognition process, there is no such slot in RIO. Thus, the inheritance
mechanism is invoked and the region-area slot of the abstract-region is accessed. The demon
function attached to the region-area slot of the abstract-region is invoked. Then, the demon
function calculates the region-area of RIO by using the region-image-model-area-coef slot in the
abstract region and the area value in RIO, and returns the feature value to the recognition process.

This mechanism makes the access format of the image features (say, region-area) by the
recognition process independent of the output format of image properties (say, image area) given
by a sensor. In particular, this mechanism is convenient when we handle multiple sensors. Each
sensor has a particular output format and model-image coefficients. Thus, if we use the
conventional method without demon functions, we have to exchange access functions of the
recognition process depending on sensors and features. However, if we use this demon
mechanism, we do not need to change the access functions of the recognition process; we only need
to redefine demon functions. Since the global knowledge and all the demon functions are
attached only to the abstract region and the abstract edge, necessary changes are localized at the
level of the abstract region and abstract edge.

The relational features such as region-region or region-edge relations are also represented by using
demon functions. These relational features are defined relative to each region.
If we use the conventional method, we have to calculate all relational features with respect to all
regions beforehand, even though most of them are unnecessary. Since the calculation of a
relational feature is expensive, it is desirable to reduce the amount of calculation by using demon
functions which calculate those features only when they are actually required.
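The slot inheritance and if-needed demon behaviour described in this section can be imitated in a few lines of Python. This is a sketch, not Framekit+ itself: the `Frame` class, its `get_value` method, the coefficient 0.5, and the image area 2500 are assumptions chosen for illustration.

```python
class Frame:
    """A tiny frame: slots, an optional prototype, and if-needed demons."""

    def __init__(self, name, prototype=None, slots=None, demons=None):
        self.name = name
        self.prototype = prototype
        self.slots = dict(slots or {})
        self.demons = dict(demons or {})   # slot name -> function(frame)

    def get_value(self, slot):
        frame = self
        while frame is not None:            # walk up the inheritance chain
            if slot in frame.slots:
                return frame.slots[slot]
            if slot in frame.demons:
                # The demon receives the frame that was originally asked.
                return frame.demons[slot](self)
            frame = frame.prototype
        raise KeyError(slot)


calls = {"region-area": 0}  # counts how often the demon actually runs


def region_area_func(region):
    calls["region-area"] += 1
    coef = region.get_value("region-image-model-area-coef")
    return coef * region.get_value("area")


abstract_region = Frame(
    "abstract-region",
    slots={"region-image-model-area-coef": 0.5},      # hypothetical coefficient
    demons={"region-area": region_area_func},
)
region = Frame("region-1", prototype=abstract_region,
               slots={"id-number": 100, "area": 2500})  # hypothetical image area

assert calls["region-area"] == 0        # nothing is computed at instantiation
print(region.get_value("region-area"))  # the demon runs only on this access
```

Supporting a second sensor would then only require a different coefficient or demon on the abstract frame; the access `get_value("region-area")` used by the recognition process stays the same.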

3.2  Event  Object 

Event objects are used to convert nodes of an interpretation tree into executable modules for
feature matching and attitude determination. There are two kinds of features to be used for
matching: unitary features such as area or moment, and relational features such as region-region
relations or region-edge relations. We convert a node for a unitary feature into an object which
chooses one of the descendant nodes simply based on the value of the unitary feature of a region.
On the other hand, we convert a node for a relational feature into an object which examines
the similarity of the relational feature to all possible cases and determines the node
corresponding to the most likely case.

3.2.1  Unitary  feature  object 

When a node of an interpretation tree is required to examine a unitary feature, a unitary
feature object is generated and attached to the node. A node of an interpretation tree contains the
information about descendant nodes, the name of the unitary feature used for matching, and its
threshold value. According to these pieces of information, a unitary feature object is generated.
Thus, the prototype of a unitary feature object in the object library has the following format.

(unitary-feature-object
  (is-a program-object)
  (execution)
  (threshold)
  (branch-left)
  (branch-right))

When an instance unitary feature object is generated, it contains a method name in the
execution slot to be used for the comparison and the threshold value in the threshold slot. For
example, if an interpretation tree requires area comparison at a particular node, then the
following object will be generated at the node.

(branch-example-1
  (instance unitary-feature-object)
  (execution area-comparison-method)
  (threshold 100)
  (branch-left branch-example-10)
  (branch-right branch-example-11))

The threshold value, branch-left, branch-right, and the execution method name are obtained
from the interpretation tree and inserted by this conversion process. The object library contains
the following function.

(defun area-comparison-method (schema slot entry-region)
  (cond ((unitary-comparison
           (get-value entry-region 'region-area)
           (get-value schema 'threshold))
         (send (get-value schema 'branch-left)
               'execution entry-region))
        (t (send (get-value schema 'branch-right)
                 'execution entry-region))))

(defun unitary-comparison (arg-a arg-b)
  (cond ((>= arg-a arg-b) t) (t nil)))

The area-comparison-method is invoked by sending an execution message to the object, such as

(send 'branch-example-1 'execution entry-region)

In the arguments of the method function, schema and slot are the schema and slot
which invoke this function and are inserted by the system; in our example, branch-example-1 and
execution are inserted automatically, while the argument entry-region is given to this method
function directly by the send function^3. Depending on the result from unitary-comparison,
another execution message will be sent either to branch-example-10 or branch-example-11.
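The message flow through a unitary feature object can be mimicked with ordinary method calls. The following Python sketch is an illustration only: the classes `UnitaryFeatureObject` and `Leaf` are hypothetical, the entry region is reduced to a dictionary of already computed features, and the comparison is assumed to be "greater than or equal to".

```python
class Leaf:
    """Stand-in for whatever object sits at the end of a branch."""

    def __init__(self, name):
        self.name = name

    def execution(self, entry_region):
        return self.name


class UnitaryFeatureObject:
    """Compares one scalar feature with a threshold and forwards the message."""

    def __init__(self, name, feature, threshold, branch_left, branch_right):
        self.name = name
        self.feature = feature
        self.threshold = threshold
        self.branch_left = branch_left
        self.branch_right = branch_right

    def execution(self, entry_region):
        # Assumed comparison: feature value >= threshold selects the left branch.
        if entry_region[self.feature] >= self.threshold:
            return self.branch_left.execution(entry_region)
        return self.branch_right.execution(entry_region)


node = UnitaryFeatureObject(
    "branch-example-1", "region-area", 100,
    branch_left=Leaf("branch-example-10"),
    branch_right=Leaf("branch-example-11"),
)
print(node.execution({"region-area": 150}))  # forwarded to the left branch
print(node.execution({"region-area": 50}))   # forwarded to the right branch
```

Because each node only knows its own threshold and its two successors, adding a new comparison method does not require changes to existing nodes.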


Similarly,  we  can  define  various  discrimination  functions,  where  required  functions  are 
dependent  on  the  strategy  generation.  In  the  present  implementation,  the  following  functions  are 
prepared  in  the  object  library; 


^3 Note that (get-value entry-region 'region-area) invokes a region-area demon function attached to the abstract-


•  area-comparison-method, 

•  moment-comparison-method, 

•  moment-ratio-comparison-method, 

• surface-characteristic-comparison-method,

•  surrounding-nth-face-area-comparison-method, 

• surrounding-nth-face-moment-comparison-method,

•  surrounding-nth-face-moment-ratio-comparison-method, 

•  surrounding-nth-face-surface-characteristics-comparison-method. 

It is quite easy to include different unitary features. This only requires addition of the
necessary feature matching methods and the feature slot with the feature extraction demon to the
library; it is not necessary to modify any other existing objects.

3.2.2 Relational feature object

If  a  node  of  an  interpretation  tree  is  required  to  examine  a  relational  feature,  a  parallel  tracking 
mechanism  is  adopted  which  examines  the  similarity  of  the  relational  features  of  all  immediate 
descendant  nodes  against  those  of  the  entry  region  and  sends  the  next  execution  message  to  the 
node  corresponding  to  the  highest  similarity. 

Since the parallel tracking mechanism is relatively complicated, we divide it into the following
four kinds of objects: a message handling object, feature matching objects, feature matching
demon objects, and a comparing object. See Figure 1. A message handling object sends
execution messages to feature matching objects. A feature matching object measures the similarity
between the feature of the entry region and one of the model features with the help of a feature
matching demon object, and then sends the similarity measure to the comparing object and a
finish notice message to the message handling object. Once the message handling object receives
finish notice messages from all feature matching objects, it invokes the comparing object. The
comparing object examines the similarity measures and sends the next execution message to the
appropriate object.

Message  handling  object 


Figure 1: Parallel tracking mechanism

The message handling object controls the parallel matching mechanism. It sends the model
features to each feature matching object one by one. The prototype of the message handling
object has the following format.

(message-handling-object
  (is-a program-object)
  (execution message-handling-method)
  (finished-notice finished-notice-method)
  (sending-object-list)
  (finished-object-list)
  (model-feature-list)
  (next-node-list)
  (comparing-object))


The slot model-feature-list contains the model relational features given from the node of the
interpretation tree. The slot sending-object-list contains the feature matching objects; these
feature matching objects are generated while the system converts the interpretation tree
into executable code, and the system registers them in this slot. The slot finished-object-list
contains the feature matching objects which have finished the matching operation and sent the
notice to this object. Once all model matching is done, a comparing object is invoked. The object
to be invoked is stored in the comparing-object slot.


The  object  library  contains  the  following  message-handling-method  and 
finished-notice-method. 

(defun message-handling-method (schema slot entry-region)
  (do ((model-list (get-value schema 'model-feature-list)
                   (cdr model-list))
       (sending-list (get-value schema 'sending-object-list)
                     (cdr sending-list))
       (node-list (get-value schema 'next-node-list)
                  (cdr node-list)))
      ((null model-list))
    (send (car sending-list) 'execution
          entry-region (car model-list)
          (car node-list))))

Basically, this method sends model relational features one by one to feature matching objects.
In order to make a correspondence between a feature and the corresponding descendant node,
this method also sends the names of the descendant nodes to the feature matching objects.

(defun finished-notice-method
    (schema slot sender entry-region)
  (add-value schema 'finished-object-list sender)
  (cond ((= (length (get-values schema 'finished-object-list))
            (length (get-values schema 'sending-object-list)))
         (send (get-value schema 'comparing-object)
               'execution entry-region))))

This method adds the sender's name to the finished-object-list every time it receives a finished
notice from a feature matching object. Once all the feature matching objects invoked by this object
have finished their matching operations, the message handling object sends an execution message to
the comparing object.
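The bookkeeping done by the message handling object (dispatch one model feature per matcher, then wait until every matcher has reported) can be sketched in Python as follows. The classes are hypothetical stand-ins: `EchoMatcher` reports completion immediately instead of measuring a similarity, `RecordingComparer` only counts invocations, and the handler is passed to the matcher as an argument rather than stored in a slot.

```python
class RecordingComparer:
    """Stand-in for the comparing object; only records that it was invoked."""

    def __init__(self):
        self.invocations = 0

    def execution(self, entry_region):
        self.invocations += 1


class EchoMatcher:
    """Stand-in feature matching object that reports completion at once."""

    def __init__(self, name):
        self.name = name
        self.received = None

    def execution(self, handler, entry_region, model_feature, node):
        self.received = (model_feature, node)
        handler.finished_notice(self, entry_region)


class MessageHandler:
    def __init__(self, sending, model_features, next_nodes, comparer):
        self.sending = sending                # feature matching objects
        self.model_features = model_features  # one model feature per matcher
        self.next_nodes = next_nodes          # descendant node per feature
        self.comparer = comparer
        self.finished = []

    def execution(self, entry_region):
        # Send each model feature, with its descendant node, to one matcher.
        for matcher, feature, node in zip(self.sending, self.model_features,
                                          self.next_nodes):
            matcher.execution(self, entry_region, feature, node)

    def finished_notice(self, sender, entry_region):
        self.finished.append(sender.name)
        # Invoke the comparing object only after every matcher has reported.
        if len(self.finished) == len(self.sending):
            self.comparer.execution(entry_region)


matchers = [EchoMatcher(f"m{i}") for i in range(1, 4)]
comparer = RecordingComparer()
handler = MessageHandler(matchers, ["f1", "f2", "f3"], ["n1", "n2", "n3"], comparer)
handler.execution({"id-number": 100})
print(handler.finished, comparer.invocations)
```

Counting finish notices rather than assuming an order of completion is what would allow the matchers to run concurrently, which is the point of the parallel tracking mechanism.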

Feature  matching  object 

The  feature  matching  object  performs  the  relational  feature  matching.  The  prototype  of  the 
feature  matching  object  has  the  following  format. 


(feature-matching-object
  (is-a program-object)
  (execution feature-matching-method)
  (finished-notice finished-notice-method)
  (comparing-object)
  (message-handling-object)
  (feature-matching-demon-object)
  (node))

The comparing-object, message-handling-object, and feature-matching-demon-object slots
contain the names of the corresponding objects and are filled by the conversion process.
The execution slot contains feature-matching-method, which examines the similarity
between the feature sent by the message handler and those of the entry region, while the main
body of the calculation is done by the feature-matching-demon-object. These methods can be
represented in the library as

(defun feature-matching-method
    (schema slot entry-region model-feature node)
  (new-value schema 'node node)
  (send (get-value schema 'feature-matching-demon-object)
        'execution entry-region model-feature))

(defun finished-notice-method
    (schema slot score)
  (send (get-value schema 'comparing-object)
        'add-score
        score
        (get-value schema 'node)))


Feature  matching  demon  object 

The feature matching demon object measures the similarity between the model feature and the
features of the entry region. This function further invokes demon functions attached to the entry
region to get either region-region relations or region-edge relations and then calculates the
similarity measure between them by using a similarity measuring method. The resulting measure
will be returned to the feature matching object and then sent to the comparing object. The
prototypical object in the library has the following format:

(feature-matching-demon-object 
(is-a  program-object) 

(execution) 

(feature-matching-object) ) 

The slot feature-matching-object contains the name of the object which invokes this object. This
slot is filled by the conversion process. The slot execution contains a similarity measuring method.

In the present implementation, the following two methods are prepared in the library.

•  region-region  similarity  measuring  method 

•  region-edge  similarity  measuring  method 

The similarities of the region-region relational feature and the region-edge relational feature are
measured based on the voting index. For relational features, see Appendix I, and for voting
indices, see Appendix II. If a different similarity measure is necessary, it is only necessary to add
the method to the library and to insert the method name into the execution slot of this object.


Comparing  object 


Each  time  a  comparing  object  receives  a  message  add-score  with  the  similarity  measure  and 
the  node  from  a  feature  matching  object,  it  will  add  the  measure  to  the  score  list  and  the  node  to 
the  next  node  list.  After  the  message  handling  object  finishes  its  sending  to  the  feature  matching 
objects,  it  sends  an  execution  message  to  a  comparing  object  and  invokes  it.  The  comparing 
object  examines  the  similarity  measures  in  slot  "score-list",  chooses  the  highest  measure,  and 
sends  the  next  execution  message  to  the  node  corresponding  to  the  highest  measure.  Thus,  the 
prototype  of  the  comparing  object  has  the  following  format. 

(compare-object
  (is-a program-object)
  (execution compare-object-method)
  (add-score add-score-method)
  (score-list)
  (next-node-list))


The  following  two  methods  are  also  prepared  in  the  library. 

(defun compare-object-method (schema slot entry-region)
  (send (the-most-highest-node
          (get-value schema 'score-list)
          (get-value schema 'next-node-list))
        'execution entry-region))

(defun add-score-method (schema slot score node)
  (add-value schema 'score-list score)
  (add-value schema 'next-node-list node))


where the function the-most-highest-node returns the node in the next-node-list which has the
highest value in the score list.
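The comparing object amounts to an argmax over two parallel lists. A minimal Python sketch follows, assuming hypothetical `CompareObject` and `Leaf` classes and made-up similarity scores; on ties, this version keeps the node whose score was added first, which is a choice of the sketch and not something the paper specifies.

```python
class Leaf:
    """Stand-in for the object generated at a descendant node."""

    def __init__(self, name):
        self.name = name

    def execution(self, entry_region):
        return self.name


class CompareObject:
    def __init__(self):
        self.score_list = []
        self.next_node_list = []

    def add_score(self, score, node):
        # Scores and nodes are kept in two parallel lists, as in the prototype.
        self.score_list.append(score)
        self.next_node_list.append(node)

    def execution(self, entry_region):
        # Index of the highest similarity measure (first one wins on ties).
        best = max(range(len(self.score_list)), key=lambda i: self.score_list[i])
        return self.next_node_list[best].execution(entry_region)


compare = CompareObject()
compare.add_score(0.2, Leaf("node-a"))
compare.add_score(0.9, Leaf("node-b"))
compare.add_score(0.4, Leaf("node-c"))
print(compare.execution({"id-number": 100}))
```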
3.2.3 Attitude determination object

An attitude determination object is generated at a leaf node of an interpretation tree. At each
leaf node, the interpretation tree knows the correspondence between the image regions and
model faces, in particular the one between the entry region and the corresponding model face. If we
recover the local coordinates of the model face from the information of the entry region, then we
can obtain the body coordinates by using the local coordinates and the transformation from the
local coordinates to the body coordinates obtained from the geometric model. In our system, we
define the local z axis by the surface orientation, and the x axis by the minimum moment direction
and visible face relationships. Once this object determines the body coordinates, it sends the
coordinates to the verification object.
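The composition of coordinate frames described here can be written out explicitly. The Python sketch below is an assumption-laden illustration: rotations are 3x3 nested lists, the minimum moment direction is assumed to be already disambiguated (the paper uses visible face relationships for that), translation is ignored, and the function names are hypothetical. If L maps local to world coordinates and T maps local to body coordinates, then the body-to-world rotation is L times the transpose of T.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def local_frame(normal, moment_dir):
    """Local-to-world rotation; columns are the local x, y, z axes."""
    z = normalize(normal)                      # z axis: surface orientation
    d = sum(m * c for m, c in zip(moment_dir, z))
    # x axis: moment direction projected onto the face plane.
    x = normalize([m - d * c for m, c in zip(moment_dir, z)])
    y = cross(z, x)                            # completes a right-handed frame
    return [[x[i], y[i], z[i]] for i in range(3)]


def body_to_world(local_to_world, local_to_body):
    # v_world = L v_local and v_body = T v_local, hence v_world = L T^T v_body.
    return matmul(local_to_world, transpose(local_to_body))


# Hypothetical measurements from an entry region and a hypothetical model transform.
frame = local_frame(normal=[0.0, 0.0, 2.0], moment_dir=[3.0, 0.0, 1.0])
local_to_body = [[0, -1, 0],
                 [1, 0, 0],
                 [0, 0, 1]]                    # 90 degree rotation about z
rotation = body_to_world(frame, local_to_body)
for row in rotation:
    print(row)
```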

The prototypical attitude determination object has the following format.

(attitude-determination-object
  (is-a program-object)
  (execution attitude-determination-method)
  (transformation)
  (verification-object))

3.2.4  Verification  object 

The verification object is used to generate an expected image and verify the recognition result.
After the exact attitude is determined, the verification object will create an expected image by
using a geometric modeler. From the expected image, it will extract 2D edge information and
match this with the input scene to confirm the recognition.

(verification-object
  (is-a program-object)
  (execution verification-method))

4  Generating  an  Executable  Code  for  a  Toy  Wagon 

We  choose  a  toy  wagon  to  demonstrate  our  ideas.  We  use  a  geometric  modeler  to  generate  a 
model  of  the  toy  wagon.  Figure  2  shows  the  model  of  the  toy  wagon.  It  is  a  relatively  complex 
geometric  object.  In  order  to  derive  possible  aspects,  we  sample  possible  views  and  group  them 
into  17  aspects  based  on  the  visible  faces.  Figure  3  shows  the  given  interpretation  tree,  which 
defines  the  necessary  feature  matchings  at  each  node. 


Once the interpretation tree is obtained, its nodes are converted to objects using the object
library.

Figure 2: The model of a toy wagon

At the nodes b1, b11, and b111 of the interpretation tree, each unitary feature matching node is
converted into one unitary feature matching object.

For example, at node b1 of the interpretation tree, the following object is generated.

(b1
  (execution moment-comparison-method)
  (threshold 5000)
  (branch-left b11)
  (branch-right b12))

where the threshold value 5000 is given from the interpretation tree. Similar objects are generated
at b11 and b111 by using the same moment feature and different threshold values.

A relational feature is matched using a parallel tracking mechanism. A parallel
tracking mechanism is divided into four kinds of objects: a message handling object, feature
matching objects, feature matching demon objects, and a comparing object. These objects are
generated by the conversion program when a parallel tracking mechanism is required.

The nodes b12, b112, b122, b121, b1111, b1112, b1121, and b11121 require relational feature
matching and are therefore converted into objects that execute the parallel tracking mechanism.
Let us consider the case of b112, at which a region-region relational feature is used in
matching. The conversion program instantiates one message handling object, b112; four feature
matching objects, b112-f-1, ..., b112-f-4; four feature matching demon objects,
b112-f-1-d, ..., b112-f-4-d; and one comparing object, b112-c, from the prototypical objects in
the library.

Figure 3: Interpretation tree for a toy wagon

First,  a  message  handling  object  such  as 

(b112
  (instance 'message-handling-object)
  (sending-object-list
    '(b112-f-1 b112-f-2
      b112-f-3 b112-f-4))
  (finished-object-list nil)
  (model-feature-list
    '(((10 20 30 0.5)) ...))
  (next-node-list
    '(a13 a12 a11 b1121))
  (compare-object b112-c))

is generated. The contents of the model-feature-list slot are obtained from the relationship
between the entry region and the surrounding visible regions by consulting a model data base, and
represent region-region relational features such as the distance between regions and the
difference between two surface normals. More precise definitions can be found in Appendix I
(Region-region Relational Feature).

Then,  four  feature  matching  objects  are  instantiated  from  the  prototype  in  the  object  library. 

One  of  them  looks  like  this: 

(b112-f-1
  (instance feature-matching-object)
  (comparing-object b112-c)
  (message-handler-object b112)
  (feature-matching-demon-object
    b112-f-1-d))

Then four feature matching demon objects are instantiated from the prototypical object in the
library; they have the same format as the feature matching objects.

Finally, a comparing object is instantiated.

(b112-c
  (instance comparing-object)
  (score-list nil)
  (next-node-list nil))
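
The instantiation steps described above for one relational feature matching node can be summarized in a Python sketch. It only mirrors the slot layout of the listings and is not the OPS5 conversion program itself. The function name `instantiate_parallel_tracking`, the use of dictionaries as frames, and the placeholder feature values are assumptions; the node is written b112 here, and we also assume one feature matching object per model relational feature. The demon frames carry only an instance slot for brevity.

```python
def instantiate_parallel_tracking(node, model_features, next_nodes):
    """Build the frames for one relational feature matching node.

    Assumption: one feature matching object (and one demon object) is
    created per model relational feature, as in the four-matcher example.
    """
    matchers = [f"{node}-f-{i}" for i in range(1, len(model_features) + 1)]
    comparer = f"{node}-c"
    frames = {
        node: {
            "instance": "message-handling-object",
            "sending-object-list": list(matchers),
            "finished-object-list": [],
            "model-feature-list": list(model_features),
            "next-node-list": list(next_nodes),
            "compare-object": comparer,
        },
        comparer: {
            "instance": "comparing-object",
            "score-list": [],
            "next-node-list": [],
        },
    }
    for matcher in matchers:
        demon = f"{matcher}-d"
        frames[matcher] = {
            "instance": "feature-matching-object",
            "comparing-object": comparer,
            "message-handler-object": node,
            "feature-matching-demon-object": demon,
        }
        frames[demon] = {"instance": "feature-matching-demon-object"}
    return frames


# Placeholder model features; real values come from the model data base.
features = [(10, 20, 30, 0.5), (12, 25, 40, 0.3), (8, 15, 20, 0.7), (9, 30, 35, 0.4)]
frames = instantiate_parallel_tracking("b112", features, ["a13", "a12", "a11", "b1121"])
print(sorted(frames))
```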

At each leaf node, attitude determination objects and verification objects are generated. For
example, at node a9, the following two objects are generated.

(a9
  (instance attitude-determination-object)
  (transformation
    ((0.0 0.0 1.0) ...))
  (verification-object a9-v))

(a9-v
  (instance verification-object))

Note that some of the instance objects do not define execution slots explicitly; these slots are
inherited from their prototypes in the object library.
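
Slot inheritance of this kind can be sketched in a few lines of Python. The `Frame` class and its `get` method are hypothetical names introduced only for illustration; the frame names and the verification-method value are taken from the listings above.

```python
class Frame:
    """A minimal frame: named slots plus an optional prototype (parent)."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        # Look in this frame first, then walk up the prototype chain.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(f"{self.name} has no slot {slot!r}")


program_object = Frame("program-object")
verification_object = Frame(
    "verification-object", parent=program_object, execution="verification-method"
)
a9_v = Frame("a9-v", parent=verification_object)  # defines no slots of its own

print(a9_v.get("execution"))
```

The instance a9-v stores nothing itself, yet a request for its execution slot is answered by its prototype.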

Similar operations are applied to all nodes in the interpretation tree and give the executable
program shown in Figure 4. The conversion program is implemented using the rule
representation language OPS5 [4].

5  Running  the  Code 

This section shows an example of the obtained program running on a real scene. Figure 5 is
the input scene for recognition. Figure 7 shows those regions whose surface orientation can be
determined, as shown in Figure 6, by using photometric stereo. By using a dual photometric
stereo system, we can determine the depth of each region. We also use an edge extractor. Three
images obtained under different lighting conditions are processed. The resulting edges are
shown in Figure 8. The system instantiates region objects and edge objects for all the regions
and edges in the scene by using the abstract-region object and the abstract-edge object in the
object library.

The largest region at the top of the pile is selected as the entry region (in this case, region R90
in Figure 5(c)) and sent to b1:

(send 'b1 'execution entry-region)

where entry-region = R90. Since b1's execution slot contains the moment-comparison-method,
the moment comparison method is invoked. This function sends a message to region R90 to get
region-moment, which is calculated by the region-moment demon function from the moment
value of R90. Notice here that the moment in an image is converted into a moment value in the
geometric model by the demon function. See Figure 9.
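
The interplay of message sending and demon functions can be sketched as follows. This is an illustration under stated assumptions, not the actual system: the `Region` class, the `send` helper, the scale factor standing in for the image-to-model conversion, the numeric moment value, and the convention that values below the threshold go to the left branch are all hypothetical.

```python
class Region:
    """A data object whose region-moment slot is filled on demand by a demon."""

    def __init__(self, name, image_moment, scale):
        self.name = name
        self.slots = {"image-moment": image_moment}
        self.scale = scale  # hypothetical image-to-model conversion factor
        self.demons = {"region-moment": self._region_moment_demon}

    def _region_moment_demon(self):
        # Convert the moment measured in the image into a model moment.
        return self.slots["image-moment"] * self.scale

    def get(self, slot):
        if slot not in self.slots:
            # The demon runs only when the value is first requested.
            self.slots[slot] = self.demons[slot]()
        return self.slots[slot]


def moment_comparison_method(node, region):
    """Return the next node (assumed: below the threshold goes left)."""
    value = region.get("region-moment")
    return node["branch-left"] if value < node["threshold"] else node["branch-right"]


def send(objects, name, slot, *args):
    """Invoke the method stored in the given slot of the named object."""
    obj = objects[name]
    return obj[slot](obj, *args)


objects = {
    "b1": {
        "execution": moment_comparison_method,
        "threshold": 5000,
        "branch-left": "b11",
        "branch-right": "b12",
    }
}
r90 = Region("R90", image_moment=1200.0, scale=2.5)
next_node = send(objects, "b1", "execution", r90)
print(next_node)
```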

Figure 4: An executable program represented by objects. A U node represents
a unitary feature matching object; an M node represents a message handling
object; an F node represents a relational feature matching object; a C node
represents a comparing object; an A node represents an attitude determining
object; a V node represents a verification object.

Figure 9: Retrieving feature value from a region.


From the comparison between the threshold value in the unitary feature matching object b1
and the feature value obtained from R90, the object sends an execution message and the entry
region to b11. The object b11 repeats a similar operation and sends an execution message and
the entry region to b112. Since b112 is a message handling object, it sends messages to b112-f-1,
b112-f-2, b112-f-3, and b112-f-4, one by one, with the model relational features. At each feature
matching object, a similarity measure for the region-region relational feature obtained from the
region-region relationship (R90 and R85) against one model relational feature is computed and
sent to the comparing object, b112-c. From the accumulated scores, the comparing object b112-c
sends an execution message to node a11 with the entry region.
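
The run-time behavior of the parallel tracking mechanism can be sketched as follows. The text does not specify the similarity measure, so the function `similarity` below (an inverse Euclidean distance) is purely an assumption, as are the class and function names and all feature values. Only the node names and the rule that the comparing object forwards execution to the best-scoring next node follow the description above.

```python
import math


def similarity(scene_feature, model_feature):
    """Assumed similarity measure: 1 for identical vectors, falling toward 0."""
    return 1.0 / (1.0 + math.dist(scene_feature, model_feature))


class ComparingObject:
    """Accumulates (score, next node) pairs and selects the best candidate."""

    def __init__(self):
        self.score_list = []

    def receive(self, score, next_node):
        self.score_list.append((score, next_node))

    def best_next_node(self):
        return max(self.score_list)[1]


def run_message_handler(scene_feature, model_features, next_nodes, comparer):
    # One message per feature matching object, each with its own model feature.
    for model_feature, next_node in zip(model_features, next_nodes):
        comparer.receive(similarity(scene_feature, model_feature), next_node)
    return comparer.best_next_node()


# Hypothetical relational feature of the scene and hypothetical model features.
scene_feature = (10.0, 21.0, 30.0, 0.5)
model_features = [
    (40.0, 5.0, 60.0, 0.2),
    (25.0, 15.0, 45.0, 0.3),
    (10.0, 20.0, 30.0, 0.5),
    (80.0, 70.0, 10.0, 0.9),
]
next_nodes = ["a13", "a12", "a11", "b1121"]
chosen = run_message_handler(scene_feature, model_features, next_nodes, ComparingObject())
print(chosen)
```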

At this point, the system has found the correspondence between the entry region and the roof
face of the toy wagon. The attitude determination object then determines the local coordinates of
the face by using the surface normal, the minimum moment direction, and the region-region
relation between R90 and R85. The bold lines in Figure 10 indicate the tracks of the message
passing. Finally, the body coordinates are recovered using the local coordinates and the
transformation between the roof face of the toy wagon and the body coordinates. The attitude
determination object, a11, sends an execution message to a11-v with the entry region and the
body coordinates.

The verification object, a11-v, generates an expected image (Figure 11) by using a geometric
modeler based on the body coordinates, extracts the edges of the expected image that are longer
than a certain threshold, and compares them with the edges from the line finder. The result is
shown in Figure 10(c), where the bold lines indicate the expected edges and the thin lines indicate
the image edges. The voting index obtained from this matching represents the reliability of the
recognition. For this example, the reliability of the recognition is 0.8.


6  Conclusion 

This  paper  has  discussed  how  various  modules  are  prepared  and  used  for  generating  a 
recognition  program  from  a  given  interpretation  tree  so  that  we  can  generate  a  recognition 
program  from  a  geometric  model  automatically.  We  designed  the  module  set  as  the  object 
library  using  object-oriented  programming.  The  object-oriented  programming  paradigm  provides 
modularity  and  extensibility  to  the  object  library.  The  objects  in  the  object  library  are  divided 
into two categories: data objects and event objects. A data object is used for representing
geometric objects and extracting features from geometric objects. An event object is used for
feature matching and attitude determination. We generate an executable program by properly
selecting and instantiating modules from the object library. This method has been applied to the
generation of a recognition program for a toy wagon. The generated program has been tested
with real scenes and has recognized the wagon in a pile. The generation method developed here
provides a useful tool for the automatic generation of recognition programs.

Figure 12: Verification by the extracted edges

I. Relational Features


I.1 Region-region Relational Feature

The relationship between two regions can be described by the following parameters (Figure I-1):

• d : The distance between the mass centers of two regions.

• α : The angle between the minimum moment directions of two regions.

• β : The angle between the surface orientations of two regions.

• A : The area of the region other than the entry region.

We form a four-dimensional feature vector (d α β A) to represent the relational feature
between the entry region and the other region. A demon function is invoked when feature
extraction is requested. Then a set of feature vectors relative to the entry region is found. These
feature vectors are used in feature matching.

In the four-dimensional feature space (d α β A), we test a hypothesis by comparing all the
feature vectors with the predicted feature vector that is generated by the model. If they are close
in the four-dimensional feature space, we accept this hypothesis and conclude the feature
matching process. If the matching fails, we reject the hypothesis and generate another
hypothesis. This hypothesis generation and test can be done by using a parallel tracking schema.
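
The hypothesis test in the four-dimensional feature space can be sketched as below. How closeness is measured is not stated here, so the per-component tolerances are an assumption, chosen because the four components have different units; the function names and all numeric values are hypothetical.

```python
def is_close(scene_vector, predicted_vector, tolerances):
    """True if every component lies within its (assumed) tolerance."""
    return all(
        abs(s - m) <= t for s, m, t in zip(scene_vector, predicted_vector, tolerances)
    )


def accept_hypothesis(scene_vectors, predicted_vector, tolerances):
    """Accept if some scene feature vector is close to the model prediction."""
    return any(is_close(v, predicted_vector, tolerances) for v in scene_vectors)


# Feature vectors are (d, alpha, beta, A); all numbers are made up.
predicted = (10.0, 0.5, 0.3, 200.0)
tolerances = (2.0, 0.2, 0.2, 40.0)
scene_vectors = [
    (25.0, 1.2, 0.1, 90.0),
    (11.0, 0.45, 0.35, 215.0),
]
print(accept_hypothesis(scene_vectors, predicted, tolerances))
```

If no scene vector passes the test, the hypothesis is rejected and the next candidate from the model is tried in the same way.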

I.2 Region-edge Relational Feature

We use a line finder to obtain 2-D information about an edge from its projection onto the
image plane. In order to recover the 3-D information about an edge, we transform the 2-D
edge into 3-D space via an affine transformation. Let the surface orientation of the entry region
be (p q), where p = n_x/n_z and q = n_y/n_z. The affine transformation P transforms an edge
surrounding the entry region to the 3-D plane that the edge lies on.

\[
P = \begin{pmatrix}
\sqrt{1+p^2} & \dfrac{pq}{\sqrt{1+p^2}} \\
0 & \dfrac{\sqrt{1+p^2+q^2}}{\sqrt{1+p^2}} \\
0 & 0
\end{pmatrix}
\]


Figure I-2 shows the X-view of an affine transformation. The new view direction is on the Z'
axis. We can determine the original length of an edge from this view direction.

Figure I-2: X-view of affine transform
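
The length-recovering property of the transformation P can be checked numerically. The sketch below assumes orthographic projection and a planar patch whose depth changes by p per unit x and q per unit y (the sign convention does not affect lengths). The function names are hypothetical, and the matrix is taken to have the upper-triangular form with first row (√(1+p²), pq/√(1+p²)), second row (0, √(1+p²+q²)/√(1+p²)), and a zero third row.

```python
import math


def affine_matrix(p, q):
    """Return the 3x2 matrix P for a surface with gradient (p, q)."""
    s = math.sqrt(1.0 + p * p)
    return [
        [s, p * q / s],
        [0.0, math.sqrt(1.0 + p * p + q * q) / s],
        [0.0, 0.0],
    ]


def transform(P, x, y):
    """Apply P to the 2-D image vector (x, y)."""
    return tuple(row[0] * x + row[1] * y for row in P)


def edge_length_3d(p, q, start, end):
    """Length of an image edge after it is mapped onto the surface plane."""
    P = affine_matrix(p, q)
    dx, dy = end[0] - start[0], end[1] - start[1]
    return math.hypot(*transform(P, dx, dy))


# An image edge of length 1 on a surface tilted 45 degrees about the y axis.
print(edge_length_3d(1.0, 0.0, (0.0, 0.0), (1.0, 0.0)))
```

The result agrees with the direct computation of the 3-D length, √(dx² + dy² + (p·dx + q·dy)²), which is why lengths measured after the transformation can be compared with model edge lengths.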

After the affine transformation, we can use four parameters to describe the relationship
between an edge and the entry region (Figure I-3).

Figure I-3: Relation between edge and region

• r : The perpendicular distance between an edge and a region.

• θ : The angle from the minimum moment direction of a region to the perpendicular
line of an edge.

• ω : The angle from the minimum moment direction of a region to the middle line of
an edge.

• l : The length of the edge after an affine transformation.

For all the edges within the search distance, we generate feature vectors relative to the entry
region. For each model edge, we search the feature vectors in the scene to find the voting index
[3]. The summation of the voting index is compared with the total length of the model edges
that surround the entry region. If the values are close, we conclude that the matching is
successful; otherwise we reject this hypothesis and generate another hypothesis.

II. Voting Index

We generate region-edge relational features for the edges within a certain distance from the
entry region. These features are compared with the model's region-edge features generated
in advance. The length of an edge in the scene is considered as a vote for the presence of a
model edge if the following conditions are satisfied:

• The value r of the edge is within a certain range of the model edge value r_m, say,
0.9 r_m < r < 1.1 r_m.

• The value θ of the edge is within a certain range of the model edge value θ_m, say,
θ_m - 0.2 < θ < θ_m + 0.2.

• The value ω of the edge is within the maximum and minimum values of the
model edge ω_m.

• The length l of the edge is less than the length l_m of the model edge.
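
The voting conditions can be sketched in Python as follows. The `ModelEdge` class, the function names, and all numeric values are hypothetical. Dividing the summed votes by the total model edge length is one way to compare the two quantities and is an assumption of this sketch, as is a positive r value for each model edge.

```python
from dataclasses import dataclass


@dataclass
class ModelEdge:
    r: float          # perpendicular distance to the region (assumed positive)
    theta: float      # angle to the perpendicular line of the edge
    omega_min: float  # smallest admissible omega
    omega_max: float  # largest admissible omega
    length: float     # model edge length


def edge_votes(scene_edge, model_edge):
    """Check the four voting conditions for one scene edge (r, theta, omega, l)."""
    r, theta, omega, length = scene_edge
    return (
        0.9 * model_edge.r < r < 1.1 * model_edge.r
        and model_edge.theta - 0.2 < theta < model_edge.theta + 0.2
        and model_edge.omega_min <= omega <= model_edge.omega_max
        and length < model_edge.length
    )


def voting_ratio(scene_edges, model_edges):
    """Sum of the voting scene edge lengths over the total model edge length."""
    total_votes = 0.0
    for model_edge in model_edges:
        total_votes += sum(e[3] for e in scene_edges if edge_votes(e, model_edge))
    return total_votes / sum(m.length for m in model_edges)


model_edges = [
    ModelEdge(10.0, 0.0, -0.5, 0.5, 20.0),
    ModelEdge(15.0, 1.5, 1.0, 2.0, 10.0),
]
scene_edges = [
    (10.2, 0.05, 0.1, 12.0),
    (9.8, -0.1, -0.2, 6.0),
    (30.0, 0.0, 0.0, 5.0),
]
print(voting_ratio(scene_edges, model_edges))
```

In this made-up example, two scene edges vote for the first model edge and none for the second, so 18 of the 30 units of model edge length are supported.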